knowledge constraint
Randomized Transport Plans via Hierarchical Fully Probabilistic Design
Y., Sarah Boufelja, Quinn, Anthony, Shorten, Robert
An optimal randomized strategy for design of balanced, normalized mass transport plans is developed. It replaces -- but specializes to -- the deterministic, regularized optimal transport (OT) strategy, which yields only a certainty-equivalent plan. The incompletely specified -- and therefore uncertain -- transport plan is acknowledged to be a random process. Therefore, hierarchical fully probabilistic design (HFPD) is adopted, yielding an optimal hyperprior supported on the set of possible transport plans, and consistent with prior mean constraints on the marginals of the uncertain plan. This Bayesian resetting of the design problem for transport plans -- which we call HFPD-OT -- confers new opportunities. These include (i) a strategy for the generation of a random sample of joint transport plans; (ii) randomized marginal contracts for individual source-target pairs; and (iii) consistent measures of uncertainty in the plan and its contracts. An application in algorithmic fairness is outlined, where HFPD-OT enables the recruitment of a more diverse subset of contracts -- than is possible in classical OT -- into the delivery of an expected plan. Also, it permits fairness proxies to be endowed with uncertainty quantifiers.
- North America > United States > New York > New York County > New York City (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
- Research Report (0.50)
- Workflow (0.46)
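For orientation, the deterministic, regularized OT baseline that the abstract above contrasts with can be sketched with classical Sinkhorn scaling. This is a minimal illustration of entropy-regularized OT only, not the authors' HFPD-OT method; the function name `sinkhorn`, the toy cost matrix, the marginals, and the regularization strength `eps` are all assumptions made for the example.

```python
import math

def sinkhorn(cost, mu, nu, eps=1.0, n_iter=200):
    """Entropy-regularized OT via Sinkhorn scaling (illustrative baseline only)."""
    n, m = len(mu), len(nu)
    # Gibbs kernel K[i][j] = exp(-cost[i][j] / eps)
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        # Rescale rows so that the plan's row sums match mu
        u = [mu[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        # Rescale columns so that the plan's column sums match nu
        v = [nu[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # The plan is the kernel rescaled by the two scaling vectors
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

# Hypothetical toy problem: two sources, two targets, balanced unit mass
cost = [[0.0, 1.0], [1.0, 0.0]]
mu = [0.5, 0.5]
nu = [0.3, 0.7]
plan = sinkhorn(cost, mu, nu)
row_sums = [sum(row) for row in plan]
col_sums = [sum(plan[i][j] for i in range(2)) for j in range(2)]
```

The result is a single plan whose row and column sums match the prescribed marginals, which is the certainty-equivalent output the abstract describes; it carries no measure of uncertainty.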
Multilingual Knowledge Graph Completion from Pretrained Language Models with Knowledge Constraints
Song, Ran, He, Shizhu, Gao, Shengxiang, Cai, Li, Liu, Kang, Yu, Zhengtao, Zhao, Jun
Multilingual Knowledge Graph Completion (mKGC) aims to solve queries like (h, r, ?) in different languages by inferring a tail entity t, thus improving multilingual knowledge graphs. Previous studies leverage multilingual pretrained language models (PLMs) and the generative paradigm to achieve mKGC. Although multilingual pretrained language models contain extensive knowledge of different languages, their pretraining tasks cannot be directly aligned with the mKGC tasks. Moreover, the majority of KGs and PLMs currently available exhibit a pronounced English-centric bias. This makes it difficult for mKGC to achieve good results, particularly in the context of low-resource languages. To overcome these problems, this paper introduces global and local knowledge constraints for mKGC. The former is used to constrain the reasoning of answer entities, while the latter is used to enhance the representation of query contexts. The proposed method helps the pretrained model adapt better to the mKGC task. Experimental results on public datasets demonstrate that our method outperforms the previous SOTA on Hits@1 and Hits@10 by an average of 12.32% and 16.03%, respectively, which indicates that our proposed method significantly improves mKGC.
- Asia > Middle East > Iraq (0.05)
- Asia > China > Yunnan Province > Kunming (0.04)
- Asia > China > Beijing > Beijing (0.04)
- (19 more...)
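The Hits@1 and Hits@10 figures quoted in the abstract above are instances of the Hits@k metric: the fraction of queries whose correct tail entity appears among the top k ranked predictions. A minimal sketch follows; the function name `hits_at_k` and the toy predictions are hypothetical and are not taken from the paper or its datasets.

```python
def hits_at_k(ranked_candidates, gold_answers, k):
    """Fraction of queries whose gold tail entity is among the top-k predictions."""
    hits = sum(
        1
        for ranked, gold in zip(ranked_candidates, gold_answers)
        if gold in ranked[:k]
    )
    return hits / len(gold_answers)

# Hypothetical ranked predictions for four (h, r, ?) queries
predictions = [
    ["Paris", "Lyon", "Nice"],
    ["Berlin", "Bonn", "Munich"],
    ["Porto", "Madrid", "Lisbon"],
    ["Rome", "Milan", "Turin"],
]
gold = ["Paris", "Bonn", "Lisbon", "Naples"]

h1 = hits_at_k(predictions, gold, 1)  # only the first query is right at rank 1
h3 = hits_at_k(predictions, gold, 3)  # three of four gold entities are in the top 3
```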
Event Knowledge Incorporation with Posterior Regularization for Event-Centric Question Answering
Lu, Junru, Pergola, Gabriele, Gui, Lin, He, Yulan
We propose a simple yet effective strategy to incorporate event knowledge extracted from event trigger annotations via posterior regularization to improve the event reasoning capability of mainstream question-answering (QA) models for event-centric QA. In particular, we define event-related knowledge constraints based on the event trigger annotations in the QA datasets, and subsequently use them to regularize the posterior answer output probabilities from the backbone pre-trained language models used in the QA setting. We explore two different posterior regularization strategies for extractive and generative QA separately. For extractive QA, the sentence-level event knowledge constraint is defined by assessing whether a sentence contains an answer event, which is later used to modify the answer span extraction probability. For generative QA, the token-level event knowledge constraint is defined by comparing the generated token from the backbone language model with the answer event in order to introduce a reward or penalty term, which indirectly adjusts the answer generation probability. We conduct experiments on two event-centric QA datasets, TORQUE and ESTER. The results show that our proposed approach can effectively inject event knowledge into existing pre-trained language models and achieves strong performance compared to existing QA models in answer evaluation. Code and models can be found at https://github.com/LuJunru/EventQAviaPR.
- Europe > United Kingdom > England > Greater London > London (0.04)
- Asia > China (0.04)
KGS: Causal Discovery Using Knowledge-guided Greedy Equivalence Search
Learning causal relationships solely from observational data provides insufficient information about the underlying causal mechanism and the search space of possible causal graphs. As a result, the search space can often grow exponentially for approaches such as Greedy Equivalence Search (GES), which use a score-based approach to search the space of equivalence classes of graphs. Prior causal information such as the presence or absence of a causal edge can be leveraged to guide the discovery process towards a more restricted and accurate search space. In this study, we present KGS, a knowledge-guided greedy score-based causal discovery approach that uses observational data and structural priors (causal edges) as constraints to learn the causal graph. KGS is a novel application of knowledge constraints that can leverage any of the following prior edge information between any two variables: the presence of a directed edge, the absence of an edge, and the presence of an undirected edge. We extensively evaluate KGS across multiple settings in both synthetic and benchmark real-world datasets. Our experimental results demonstrate that structural priors of any type and amount are helpful and guide the search process towards improved performance and early convergence.
- North America > United States > Maryland > Baltimore County (0.04)
- North America > United States > Maryland > Baltimore (0.04)
- Europe > Poland > Masovia Province > Warsaw (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
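To make the idea of edge-level knowledge constraints in the abstract above concrete, the sketch below filters the candidate edge insertions of a greedy search using two kinds of prior knowledge: edges known to be absent and edges with a known direction. This is a heavily simplified illustration, not the KGS algorithm: it works on plain directed edges rather than equivalence classes, does no scoring, and omits undirected-edge priors. The function name `admissible_insertions` and all variable names are hypothetical.

```python
def admissible_insertions(nodes, current_edges, required_directed, absent_pairs):
    """Directed edges a greedy step may still add, given prior edge knowledge."""
    candidates = []
    for x in nodes:
        for y in nodes:
            if x == y:
                continue
            if (x, y) in current_edges or (y, x) in current_edges:
                continue  # the two variables are already adjacent
            if frozenset((x, y)) in absent_pairs:
                continue  # prior knowledge: no edge between x and y
            if (y, x) in required_directed:
                continue  # prior knowledge fixes the opposite direction
            candidates.append((x, y))
    # Stable sort: edges required by prior knowledge are tried first
    candidates.sort(key=lambda edge: edge not in required_directed)
    return candidates

# Hypothetical three-variable example
nodes = ["A", "B", "C"]
current = set()
required = {("A", "B")}            # known: A -> B
absent = {frozenset(("A", "C"))}   # known: no edge between A and C
cands = admissible_insertions(nodes, current, required, absent)
```

In this toy case the six possible insertions shrink to three, which illustrates how such priors restrict the search space before any score is computed.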
Fully Probabilistic Design for Optimal Transport
Y., Sarah Boufelja, Quinn, Anthony, Corless, Martin, Shorten, Robert
The goal of this paper is to introduce a new theoretical framework for Optimal Transport (OT), using the terminology and techniques of Fully Probabilistic Design (FPD). Optimal Transport is the canonical method for comparing probability measures and has been successfully applied in a wide range of areas (computer vision Rubner et al. [2004], computer graphics Solomon et al. [2015], natural language processing Kusner et al. [2015], etc.). However, we argue that the current OT framework suffers from two shortcomings: first, it is hard to induce generic constraints and probabilistic knowledge in the OT problem; second, the current formalism does not address the question of uncertainty in the marginals, and therefore lacks the mechanisms to design robust solutions. By viewing the OT problem as the optimal design of a probability density function with marginal constraints, we prove that OT is an instance of the more generic FPD framework. In this new setting, we can furnish the OT framework with the necessary mechanisms for processing probabilistic constraints and deriving uncertainty quantifiers, hence establishing a new extended framework, called FPD-OT. Our main contribution in this paper is to establish the connection between OT and FPD, providing new theoretical insights for both. This will lay the foundations for the application of FPD-OT in subsequent work, notably in processing more sophisticated knowledge constraints, as well as in designing robust solutions in the case of uncertain marginals.
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
- Europe > France > Hauts-de-France > Nord > Lille (0.04)
Posterior Regularized Bayesian Neural Network Incorporating Soft and Hard Knowledge Constraints
Huang, Jiayu, Pang, Yutian, Liu, Yongming, Yan, Hao
Neural Networks (NNs) have been widely used in supervised learning due to their ability to model complex nonlinear patterns, often present in high-dimensional data such as images and text. However, traditional NNs often lack the ability for uncertainty quantification. Bayesian NNs (BNNs) could help measure the uncertainty by considering the distributions of the NN model parameters. Besides, domain knowledge is commonly available and could improve the performance of BNNs if it can be appropriately incorporated. In this work, we propose a novel Posterior-Regularized Bayesian Neural Network (PR-BNN) model by incorporating different types of knowledge constraints, such as the soft and hard constraints, as a posterior regularization term. Furthermore, we propose to combine the augmented Lagrangian method and the existing BNN solvers for efficient inference. The experiments in simulation and two case studies about aviation landing prediction and solar energy output prediction demonstrate the effect of the knowledge constraints and the improved performance of the proposed model over traditional BNNs without the constraints.
- North America > United States > Arizona (0.04)
- North America > United States > Virginia > Hampton (0.04)
- North America > United States > Georgia > Clayton County (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Transportation > Air (1.00)
- Energy > Renewable > Solar (1.00)
- Energy > Power Industry (0.88)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
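The abstract above mentions the augmented Lagrangian method as the tool for enforcing constraints during inference. The generic technique can be illustrated on a one-dimensional toy problem: minimize (x - 2)^2 subject to the hard constraint x - 1 = 0. This is only a sketch of the general method, not the PR-BNN inference procedure; the toy objective, the penalty weight `rho`, and the function name are assumptions chosen so that the inner minimization has a closed form.

```python
def augmented_lagrangian(rho=10.0, n_iter=50):
    """Minimize (x - 2)^2 subject to x - 1 = 0 on a toy problem."""
    lam = 0.0  # Lagrange multiplier estimate
    x = 0.0
    for _ in range(n_iter):
        # Inner step: exact minimizer over x of
        # (x - 2)^2 + lam * (x - 1) + (rho / 2) * (x - 1)^2
        x = (4.0 - lam + rho) / (2.0 + rho)
        # Outer step: update the multiplier with the constraint violation
        lam += rho * (x - 1.0)
    return x, lam

x_star, lam_star = augmented_lagrangian()
```

The iterate converges to the constrained optimum x = 1 with multiplier 2, the value at which the objective gradient 2(x - 2) is balanced by the constraint. A pure penalty term with a fixed weight would only satisfy the constraint approximately, which is the usual motivation for the multiplier update.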
SaDe: Learning Models that Provably Satisfy Domain Constraints
Goyal, Kshitij, Dumancic, Sebastijan, Blockeel, Hendrik
With increasing real-world applications of machine learning, models are often required to comply with certain domain-based requirements, e.g., safety guarantees in aircraft systems or legal constraints in a loan approval model. A natural way to represent these properties is in the form of constraints. Including such constraints in machine learning is typically done by means of regularization, which does not guarantee satisfaction of the constraints. In this paper, we present a machine learning approach that can handle a wide variety of constraints, and guarantee that these constraints will be satisfied by the model even on unseen data. We cast machine learning as a maximum satisfiability problem, and solve it using a novel algorithm, SaDe, which combines constraint satisfaction with gradient descent. We demonstrate on three use cases that this approach learns models that provably satisfy the given constraints.
- North America > United States > New York > New York County > New York City (0.04)
- Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
Quasi-Equivalence Discovery for Zero-Shot Emergent Communication
Bullard, Kalesha, Kiela, Douwe, Pineau, Joelle, Foerster, Jakob
Effective communication is an important skill for enabling information exchange in multi-agent settings and emergent communication is now a vibrant field of research, with common settings involving discrete cheap-talk channels. Since, by definition, these settings involve arbitrary encoding of information, typically they do not allow for the learned protocols to generalize beyond training partners. In contrast, in this work, we present a novel problem setting and the Quasi-Equivalence Discovery (QED) algorithm that allows for zero-shot coordination (ZSC), i.e., discovering protocols that can generalize to independently trained agents. Real-world problem settings often contain costly communication channels, e.g., robots have to physically move their limbs, and a non-uniform distribution over intents. We show that these two factors lead to unique optimal ZSC policies in referential games, where agents use the energy cost of the messages to communicate intent. Other-Play was recently introduced for learning optimal ZSC policies, but requires prior access to the symmetries of the problem. Instead, QED can iteratively discover the symmetries in this setting and converges to the optimal ZSC policy.
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Learning Bayesian networks from demographic and health survey data
Kitson, Neville Kenneth, Constantinou, Anthony C.
Child mortality from preventable diseases such as pneumonia and diarrhoea in low and middle-income countries remains a serious global challenge. We combine knowledge with available Demographic and Health Survey (DHS) data from India, to construct Bayesian Networks (BNs) and investigate the factors associated with childhood diarrhoea. We make use of freeware tools to learn the graphical structure of the DHS data with score-based, constraint-based, and hybrid structure learning algorithms. We investigate the effect of missing values, sample size, and knowledge-based constraints on each of the structure learning algorithms and assess their accuracy with multiple scoring functions. Weaknesses in the survey methodology and data available, as well as the variability in the BNs generated, mean that it is not possible to learn a definitive causal BN from data. However, knowledge-based constraints are found to be useful in reducing the variation in the graphs produced by the different algorithms, and produce graphs which are more reflective of the likely influential relationships in the data. Furthermore, valuable insights are gained into the performance and characteristics of the structure learning algorithms. Two score-based algorithms in particular, TABU and FGES, demonstrate many desirable qualities: a) with sufficient data, they produce a graph which is similar to the reference graph, b) they are relatively insensitive to missing values, and c) they behave well with knowledge-based constraints. The results provide a basis for further investigation of the DHS data and for a deeper understanding of the behaviour of the structure learning algorithms when applied to real-world settings.
- Asia > India (0.24)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- (7 more...)
- Research Report > New Finding (0.68)
- Research Report > Experimental Study (0.46)
Domain Constraint Approximation based Semi Supervision
Wu, Yifu, Wei, Jin, Roche, Rigoberto
Deep learning for supervised learning has achieved astonishing performance in various machine learning applications. However, annotated data is expensive and rare. In practice, only a small portion of data samples are annotated. Pseudo-ensembling-based approaches have achieved state-of-the-art results in computer vision related tasks. However, they still rely on the quality of an initial model built from labeled data. With less labeled data, model performance may degrade substantially. Domain constraints are another way to regularize the posterior, but they have some limitations. In this paper, we propose a fuzzy domain-constraint-based framework which loosens the requirements of traditional constraint learning and enhances the model quality for semi-supervision. Simulation results show the effectiveness of our design.
- North America > United States > Ohio > Summit County > Akron (0.05)
- North America > United States > Ohio > Cuyahoga County > Cleveland (0.04)